1,160 research outputs found

    Rethinking Recurrent Latent Variable Model for Music Composition

    We present a model for capturing musical features and creating novel sequences of music, called the Convolutional Variational Recurrent Neural Network. To generate sequential data, the model uses an encoder-decoder architecture with latent probabilistic connections to capture the hidden structure of music. Using the sequence-to-sequence model, our generative model can exploit samples from a prior distribution and generate a longer sequence of music. We compare the performance of our proposed model with other types of neural networks using the Information Rate criterion, as implemented by the Variable Markov Oracle, a method that allows statistical characterization of musical information dynamics and detection of motifs in a song. Our results suggest that the proposed model has a better statistical resemblance to the musical structure of the training data, which improves the creation of new sequences of music in the style of the originals. Comment: Published as a conference paper at IEEE MMSP 201

    Transformer Based Multi-Source Domain Adaptation

    In practical machine learning settings, the data on which a model must make predictions often come from a different distribution than the data it was trained on. Here, we investigate the problem of unsupervised multi-source domain adaptation, where a model is trained on labelled data from multiple source domains and must make predictions on a domain for which no labelled data has been seen. Prior work with CNNs and RNNs has demonstrated the benefit of mixture of experts, where the predictions of multiple domain expert classifiers are combined, as well as domain adversarial training, which induces a domain-agnostic representation space. Inspired by this, we investigate how such methods can be effectively applied to large pretrained transformer models. We find that domain adversarial training has an effect on the learned representations of these models while having little effect on their performance, suggesting that large transformer-based models are already relatively robust across domains. Additionally, by comparing several variants of mixing functions, including one novel mixture based on attention, we show that mixture of experts leads to significant performance improvements. Finally, we demonstrate that the predictions of large pretrained transformer-based domain experts are highly homogeneous, making it challenging to learn effective functions for mixing their predictions. Comment: 12 pages, 3 figures, 5 table
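
    As a rough illustration of what an attention-based mixing function over domain experts could look like, the sketch below weights each expert's class probabilities by a softmax over dot-product scores. This is not the paper's method: the function names, the use of one key vector per expert, and all numbers are assumptions made for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_mix(query, expert_keys, expert_probs):
    """Mix expert predictions with attention weights (hypothetical helper).

    query: representation of the input example
    expert_keys: one key vector per domain expert (an assumption of this sketch)
    expert_probs: one class-probability vector per domain expert
    """
    # Dot-product score between the input and each expert's key.
    scores = [sum(q * k for q, k in zip(query, key)) for key in expert_keys]
    weights = softmax(scores)
    n_classes = len(expert_probs[0])
    # Convex combination of the experts' class probabilities.
    mixed = [
        sum(w * probs[c] for w, probs in zip(weights, expert_probs))
        for c in range(n_classes)
    ]
    return weights, mixed

# Toy values, invented for illustration only.
weights, mixed = attention_mix(
    query=[1.0, 0.0],
    expert_keys=[[1.0, 0.0], [0.0, 1.0]],
    expert_probs=[[0.9, 0.1], [0.4, 0.6]],
)
print(weights, mixed)
```

    Because the weights form a convex combination, the mixed output remains a probability distribution. When the experts' predictions are nearly identical, as the abstract reports, the choice of weights changes the output very little.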

    Validation of the Patient Activation Measure in a Multiple Sclerosis Clinic Sample and Implications for Care

    Purpose. Patient engagement in multiple sclerosis (MS) care can be challenging at times given the unpredictable disease course, wide range of symptoms, variable therapeutic response to treatment and high rates of patient depression. Patient activation, a model for conceptualising patients’ involvement in their health care, has been found useful for discerning patient differences in chronic illness management. The purpose of this study was to validate the patient activation measure (PAM-13) in an MS clinic sample. Methods. This was a survey study of 199 MS clinic patients. Participants completed the PAM-13 along with measures of MS medication adherence, self-efficacy, depression and quality of life. Results. Results from Rasch and correlation analyses indicate that the PAM-13 is reliable and valid for the MS population. Activation was associated with MS self-efficacy, depression and quality of life but not with self-reported medication adherence. Also, participants with relapsing-remitting MS, current employment, or high levels of education were more activated than other subgroups. Conclusions. The PAM-13 is a useful tool for understanding health behaviours in MS. The findings of this study support further clinical consideration and investigation into developing interventions to increase patient activation and improve health outcomes in MS.

    Efficiency is Not Enough: A Critical Perspective of Environmentally Sustainable AI

    Artificial Intelligence (AI) is currently spearheaded by machine learning (ML) methods such as deep learning (DL), which have accelerated progress on many tasks thought to be out of reach of AI. These ML methods can often be compute hungry, energy intensive, and result in significant carbon emissions, a known driver of anthropogenic climate change. Additionally, the platforms on which ML systems run are associated with environmental impacts including and beyond carbon emissions. The solution lionized by both industry and the ML community to improve the environmental sustainability of ML is to increase the efficiency with which ML systems operate in terms of both compute and energy consumption. In this perspective, we argue that efficiency alone is not enough to make ML as a technology environmentally sustainable. We do so by presenting three high-level discrepancies in the effect of efficiency on the environmental sustainability of ML that arise when considering the many variables with which it interacts. In doing so, we comprehensively demonstrate, at multiple levels of granularity and for both technical and non-technical reasons, why efficiency is not enough to fully remedy the environmental impacts of ML. Based on this, we present and argue for systems thinking as a viable path towards improving the environmental sustainability of ML holistically. Comment: 24 pages; 6 figure

    Comparison of Nest Defense Behaviors of Goshawks (Accipiter gentilis) from Finland and Montana

    As human impacts on wildlife have become a topic of increasing interest, studies have focused on issues such as overexploitation and habitat loss. However, little research has examined potential anthropogenic impacts on animal behavior. Understanding the degree to which human interaction may alter natural animal behavior has become increasingly important in developing effective conservation strategies. We examined two populations of northern goshawks (Accipiter gentilis) in Montana and Finland. Goshawks in Finland were not protected until the late 1980s, and prior to this protection were routinely shot, as it was believed that shooting goshawks would keep grouse populations high. In the United States, goshawks were not managed for predator control. Though aggressive nest defense has been characterized throughout North America, goshawks in Finland do not show this same behavior. To quantify aggression, we presented nesting goshawks with an owl decoy, a human mannequin, and a live human and recorded their responses to each of the trial conditions. We evaluated the recordings for time of response, duration of response, whether or not an active stimulus was present to elicit the response (i.e., movement or sound), and the sex of the bird making the response. We used a t-test with unequal variances to compare the mean number of responses and response duration. Our results suggested that goshawks in Montana exhibit more aggressive nest defense behaviors than those in Finland. While this could be due to some biotic or abiotic factor that we were not able to control for in a study on such a small scale, it is also possible that the results from this study suggest another underlying cause, such as an artificial selection pressure created by shooting goshawks.
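
    The unequal-variance t-test mentioned in the abstract is commonly known as Welch's t-test. The sketch below computes its statistic and the Welch-Satterthwaite degrees of freedom using only the Python standard library. The function name and the sample values are hypothetical, invented for illustration, and are not data from the study.

```python
import math
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each group mean; variance() is the
    # unbiased sample variance (divides by n - 1).
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation; generally not an integer.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t, df

# Hypothetical response counts per nest, for illustration only.
montana = [5, 7, 6, 8, 9]
finland = [2, 3, 1, 4, 2]
t_stat, dof = welch_t(montana, finland)
print(f"t = {t_stat:.3f}, df = {dof:.1f}")
```

    Unlike Student's t-test, this form does not pool the two sample variances, which makes it the safer choice when the two populations may differ in spread.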

    Longitudinal Citation Prediction using Temporal Graph Neural Networks

    Citation count prediction is the task of predicting the number of citations a paper has gained after a period of time. Prior work viewed this as a static prediction task. As papers and their citations evolve over time, considering the dynamics of the number of citations a paper will receive would seem logical. Here, we introduce the task of sequence citation prediction, where the goal is to accurately predict the trajectory of the number of citations a scholarly work receives over time. We propose to view papers as a structured network of citations, allowing us to use topological information as a learning signal. Additionally, we learn how this dynamic citation network changes over time and the impact of paper metadata such as authors, venues and abstracts. To approach the introduced task, we derive a dynamic citation network from Semantic Scholar which spans over 42 years. We present a model which exploits topological and temporal information using graph convolution networks paired with sequence prediction, and compare it against multiple baselines, testing the importance of topological and temporal information and analyzing model performance. Our experiments show that leveraging both the temporal and topological information greatly increases the performance of predicting citation counts over time.